Breaking the Madry Defense Model with $L_1$-based Adversarial Examples
Authors
Abstract
The Madry Lab recently hosted a competition designed to test the robustness of their adversarially trained MNIST model. Attacks were constrained to perturb each pixel of the input image by a scaled maximal L∞ distortion of ε = 0.3. This decision discourages the use of attacks which are not optimized on the L∞ distortion metric. Our experimental results demonstrate that by relaxing the L∞ constraint of the competition, the elastic-net attack to deep neural networks (EAD) can generate transferable adversarial examples which, despite their high average L∞ distortion, have minimal visual distortion. These results call into question the use of L∞ as the sole measure of visual distortion, and further demonstrate the power of EAD at generating robust adversarial examples.
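As a rough illustration of why a large L∞ value need not imply large visual distortion, the sketch below (not taken from the paper; the image shape, pixel range, and perturbation pattern are assumptions) compares the L1, L2, and L∞ distortions of a perturbation concentrated on a few pixels. EAD's elastic-net regularization, which adds an L1 penalty to the usual L2 term, tends to favor exactly this kind of sparse change.

```python
import numpy as np

# Minimal sketch, not from the paper: a hypothetical 28x28 MNIST-style image
# with pixel values in [0, 1], perturbed heavily on a single 3x3 patch.
x_clean = np.zeros((28, 28))
x_clean[5:20, 5:20] = 0.5           # crude stand-in for a digit's strokes

delta = np.zeros_like(x_clean)
delta[10:13, 10:13] = 0.5           # large per-pixel change, but only 9 pixels
x_adv = np.clip(x_clean + delta, 0.0, 1.0)

d = x_adv - x_clean
print("L1   distortion:", np.abs(d).sum())         # 4.5 -- small over 784 pixels
print("L2   distortion:", np.sqrt((d ** 2).sum())) # 1.5
print("Linf distortion:", np.abs(d).max())         # 0.5, exceeds the 0.3 budget
```

Under the competition rules such a perturbation is disqualified by its L∞ value alone, even though only nine of the 784 pixels change.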
Similar resources
APE-GAN: Adversarial Perturbation Elimination with GAN
Although neural networks can achieve state-of-the-art performance when recognizing images, they often suffer dramatic failures on adversarial examples: inputs generated by applying imperceptible but intentional perturbations to clean samples from the datasets. How to defend against adversarial examples is an important problem that is well worth researching. So far, very few methods hav...
Manifold Assumption and Defenses Against Adversarial Perturbations
In the adversarial-perturbation problem of neural networks, an adversary starts with a neural network model F and a point x that F classifies correctly, and applies a small perturbation to x to produce another point x′ that F classifies incorrectly. In this paper, we propose taking into account the inherent confidence information produced by models when studying adversarial perturbations, where...
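As a concrete illustration of the setup described above (a model F, a correctly classified point x, and a nearby x′ that is misclassified), the sketch below constructs x′ with the fast gradient sign method of Goodfellow et al.; this is only one standard construction, not the confidence-based approach proposed in that paper, and the model, tensors, and budget eps are assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps=0.1):
    """Return x' = x + eps * sign(grad_x loss): a small L-infinity-bounded
    perturbation that frequently flips the model's prediction on x."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)      # loss of the correct label y
    loss.backward()
    x_adv = x + eps * x.grad.sign()          # one signed-gradient step
    return x_adv.clamp(0.0, 1.0).detach()    # keep pixels in the valid range
```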
Detecting Adversarial Attacks on Neural Network Policies with Visual Foresight
Deep reinforcement learning has shown promising results in learning control policies for complex sequential decision-making tasks. However, these neural network-based policies are known to be vulnerable to adversarial examples. This vulnerability poses a potentially serious threat to safety-critical systems such as autonomous vehicles. In this paper, we propose a defense mechanism to defend rei...
Towards Deep Learning Models Resistant to Adversarial Attacks
Recent work has demonstrated that neural networks are vulnerable to adversarial examples, i.e., inputs that are almost indistinguishable from natural data and yet classified incorrectly by the network. To address this problem, we study the adversarial robustness of neural networks through the lens of robust optimization. This approach provides a broad and unifying view on much of the prior work...
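For reference, the robust-optimization view mentioned in this snippet is usually written as the saddle-point problem below (parameters θ, loss L, data distribution D, perturbation set S); the notation is the standard one from that line of work rather than a quotation from the truncated text.

```latex
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ \max_{\delta \in \mathcal{S}} L(\theta,\, x + \delta,\, y) \Big],
\qquad
\mathcal{S} = \{\, \delta : \lVert \delta \rVert_{\infty} \le \epsilon \,\}
```

The inner maximization is what an attack approximates; the outer minimization is what adversarial training optimizes, with the L∞ ball of radius ε (0.3 for the MNIST model above) as the perturbation set.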
Publication date: 2017